Abstract

The previously proposed semantic-head-driven generation
methods run into problems if none of the daughter
constituents in the syntacto-semantic rule schemata of a
grammar fits the definition of a semantic head given in
[Shieber et al., 1990]. This is the case for the semantic
analysis rules of certain constraint-based semantic
representations, e.g. Underspecified Discourse
Representation Structures (UDRSs) [Frank and Reyle, 1992].

Since head-driven generation in general has its merits, we
simply return to a syntactic definition of 'head' and
demonstrate the feasibility of syntactic-head-driven
generation. In addition to its generality, a
syntactic-head-driven algorithm provides a basis for a
logically well-defined treatment of the movement of
(syntactic) heads, for which only ad hoc solutions have
existed so far.

1 Introduction 

Head-driven generation methods combine top-down search and
bottom-up combination in an ideal way. [Shieber et al., 1990]
proposed to define the 'head' constituent h of a phrase with
category x on semantic grounds: the semantic representations
of h and x are identical. This puts a strong restriction on
the shape of semantic analysis rules: one of the leaves must
share its semantic form with the root node. However, there
are composition rules for semantic representations which
violate this restriction, e.g. the schemata for the
construction of Underspecified Discourse Representation
Structures (UDRSs) [Frank and Reyle, 1992] where, in general,
the root of a tree is associated with a strictly larger
semantic structure than any of the leaves. In order to make a
generation method available for grammars which do not follow
the strict notion of a semantic head, a syntactic-head-driven
generation algorithm is presented, which can be specialized
to generate from UDRSs. In a second step, the method will be
extended in order to handle the movement of (syntactic) heads
in a logically well-defined manner.

*The research reported here has been funded by the
Sonder[...] Foundation DFG.

The (tactical) generation problem is the task of generating
a string from a semantic representation according to the
syntax-semantics relation defined in a given grammar. Let us
assume that the latter relation is stated by pairs of trees.
The left tree states a local syntactic dependency, i.e. the
dominance relation between a root node and a set of leaf
nodes and the linear precedence relation among the leaves.
The right tree defines the relation between the semantic
representation of the root and the semantic representations
of the leaves. We assume that there is a one-to-one map from
the nonterminal leaf nodes of the (local) syntax tree onto
the leaf nodes of the (local) semantic derivation tree.
Example:



       s               NP(VP)
      / \              /    \
    np   vp          NP      VP              (1)



If one assumes a pairwise linking from left to right, then
the links between the two trees can be omitted. Although such
pairs of trees are reminiscent of synchronous trees in TAGs
[Shieber and Schabes, 1991], they are simpler in various
ways, in particular because we will not make use of the
adjunction operation later on. In essence, pairs of trees are
just a graphical notation for what has been put forward as
the 'rule-to-rule' hypothesis, cf. [Gazdar et al., 1985],
i.e. the fact that in the grammar each syntax rule is related
to a semantic analysis rule. However, in the long run, the
tree notation suggests a more general relation, e.g. more
internal structure or additional terminal leaf nodes in the
local syntax tree.

An obvious way to implement a generation procedure (see
Fig. 1) is to relate the input semantics to the start symbol
of the grammar and then to try to expand this node in a
top-down manner according to the rules specified in the
grammar. This node expansion corresponds to an application of
the (predict)-rule in the following abstract specification of
a top-down generator. Generation terminates successfully if
all the leaf nodes are labeled with terminals (success). The
question is which method is used to make two possibly complex
symbols equal. For the sake of simplicity, we assume that the
open leaves x_0 resp. X_0 are matched by (feature) term
unification with the corresponding mother nodes in the
grammar rule. However, for the semantic form X_0, a decidable
variant of higher-order unification might be used instead, in
order to include the reduction of λ-expressions. Of course,
the necessary precautions have to be taken in order to avoid
confusion between object- and meta-level variables, cf.
[Shieber et al., 1990].
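The top-down procedure described above can be illustrated by a minimal sketch. It is not the specification of Fig. 1: it assumes ground semantic terms encoded as nested tuples, replaces unification by plain equality, and uses a hypothetical toy grammar (`RULES`, `LEXICON`, `split_application` are all illustrative names). The depth bound is an added safeguard, since an unbounded depth-first expansion need not terminate.

```python
from itertools import product

# Hypothetical toy lexicon: (category, semantics) -> word.
LEXICON = {("np", "john"): "John", ("vp", "walk"): "walks"}

def split_application(sem):
    """Decompose a ground term (functor, argument) into the daughter
    semantics [argument, functor], mirroring the pair s / NP(VP)."""
    if isinstance(sem, tuple) and len(sem) == 2:
        functor, argument = sem
        return [argument, functor]
    return None

# Each rule: (mother category, daughter categories, semantic decomposition).
RULES = [("s", ["np", "vp"], split_application)]

def generate(cat, sem, max_depth=5):
    """Yield word lists for (cat, sem) by top-down expansion."""
    word = LEXICON.get((cat, sem))
    if word is not None:            # leaf labeled with a terminal
        yield [word]
    if max_depth == 0:              # assumed bound, not part of the paper
        return
    for mother, daughters, decompose in RULES:   # (predict)
        if mother != cat:
            continue
        parts = decompose(sem)
        if parts is None:
            continue
        options = [list(generate(d, p, max_depth - 1))
                   for d, p in zip(daughters, parts)]
        for choice in product(*options):
            yield [w for words in choice for w in words]
```

For example, `list(generate("s", ("walk", "john")))` enumerates the strings licensed by the toy grammar for that input term.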



[Figure 1: abstract specification of a top-down generator.
Recoverable rule: all leaves of the syntax tree are labeled
with terminals (success)]

A depth-first realization of this abstract top-down algorithm
would work fine as long as the semantic representations of
the leaves are always strictly smaller in size than the
semantic form of the root node. But if the actual semantic
decomposition takes place in the lexicon, the semantic
representations of some subgoals will be variables, which
stand for semantic representations of any size:



(2) 





lambda : [Y] 
sem : X 

lambda : [Y] 

sem : walk(Y) 



A strict left-to-right, depth-first expansion of subgoals
might run into problems with the grammar fragment in (2) if a
left-recursive np-rule exists, because the semantics of the
np is only instantiated once the 'semantic head' of the vp
has been looked up in the lexicon.

2 Previous work 

A top-down, semantic-structure-driven generation algorithm
has been defined by [Wedekind, 1988], which gives a basis for
dynamic subgoal reordering guided by the semantic input. Some
proposals have been made for subgoal reordering at compile
time, e.g. [Minnen et al., 1993], elaborating on the work by
[Strzalkowski, 1990]. But there will be no helpful subgoal
reordering for rules with semantic head recursion:



       vp
      /  \
    vp    np                                 (3)



Obviously, a bottom-up component is required. One solution is
to keep to a top-down strategy but to do a breadth-first
search, cf. [Kohl, 1992], which will be fair and not delay
the access to the lexicon forever, as a pure depth-first
strategy does. Alternatively, one could adopt a pure
bottom-up strategy like the one which has been proposed in
[Shieber, 1988] and which is presented in Fig. 2 in a highly
schematic manner. A lexical entry qualifies as a potential
leaf node if its semantic form is a non-trivial substructure
of the input semantics (rule (lex)). The derivation trees are
built up by the (complete)-rule. Generation finally succeeds
if the root node of the current syntax tree is labeled with
the start symbol of the grammar and the root of the semantic
analysis tree with the input semantics. Due to the exclusion
of phrases with 'empty' semantics (which would be trivial
substructures of the input semantics), the method always
terminates. However, the lack of top-down guidance will lead,
in general, to a lot of non-determinism. The strong
substructure condition means that the algorithm will be
incomplete for grammars which cover semantically void phrases
like expletive expressions, particles, and subphrases of
idioms.
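The effect of the non-trivial substructure condition in the (lex) rule can be made concrete with a small sketch. It assumes ground terms as nested tuples with `()` standing for 'empty' semantics; `Entry`, `LEXICON` and `lex_candidates` are hypothetical names, and the equality-based substructure test is a simplification of what a real grammar formalism would need.

```python
from typing import NamedTuple

class Entry(NamedTuple):
    word: str
    cat: str
    sem: object   # nested tuple term; () encodes 'empty' semantics

def subterms(term):
    """Yield the term itself and every component nested inside it."""
    yield term
    if isinstance(term, tuple):
        for part in term:
            yield from subterms(part)

def lex_candidates(lexicon, input_sem):
    """Select entries whose semantics is a non-trivial substructure
    of the input semantics (semantically void entries are excluded)."""
    parts = list(subterms(input_sem))
    return [e for e in lexicon if e.sem != () and e.sem in parts]

# Assumed toy lexicon, including an expletive with empty semantics.
LEXICON = [
    Entry("John", "np", "john"),
    Entry("walks", "v", "walk"),
    Entry("it", "np", ()),
    Entry("Mary", "np", "mary"),
]
```

With the input term `("walk", "john")`, the expletive entry is never selected, which is exactly why a generator of this kind cannot produce sentences that require semantically void words.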



The head-corner generator in [van Noord, 1993] is an
illustrative instance of a sophisticated combination of
top-down prediction and bottom-up structure building, see
Fig. 3. The rule (lex) restricts the selection of lexical
entries to those which can be 'linked' to the local goal
category (visualized by a dotted line). According to van
Noord, two syntax-semantics pairs are linkable if their
semantic forms are identical, i.e. link((x, X), (x_i, X)).
The rule (hc-complete) performs a 'head-corner' completion
step for a (linked) phrase x_h, which leads to the prediction
of the head's sisters. A link marking can be removed if the
linked categories resp. the linked semantic forms are
identical (rule (local-success)). Generation succeeds if all
the leaves of the syntax tree are labeled with terminals and
if no link markings exist (rule (global-success)). In order
to obtain completeness in the general case, the inference
schemata of the head-corner generator must be executed by a
breadth-first interpreter, since a depth-first interpreter
will loop if the semantic analysis rules admit that subtrees
are associated with semantic forms which are not proper
substructures of the input semantics, and if these subtrees
can be composed recursively. Such an extreme case would be a
recursive rule for semantically empty particles ('empty'
semantics is represented by the empty list symbol []):



       part              X_1             part
      /    \            /   \             |
   part    part       X_1   X_2           []          (4)



However, if we assume that structures of that kind do not
occur, a depth-first interpreter will be sufficient, e.g. the
inference rules of the algorithm can be encoded and
interpreted directly in Prolog. Note that van Noord's method
is restricted to grammars where phrases always have a lexical
semantic head. The algorithm in [Shieber et al., 1990]
relaxes this condition.
Figure 2: Bottom-Up Generation (G grammar description; s start symbol; X input semantics; x_i syntactic
category; X_i semantic representation)



all leaves are labeled with terminals and the tree does not contain any dotted lines (global-success)

Figure 3: Head-Corner Generator (G grammar description; x_i syntactic category; X_i semantic representation)

Figure 4: A grammar with UDRS-construction rules - lexicon



3 Underspecified Discourse Representation Structure

In the following, we will briefly present a semantic
representation formalism and a corresponding set of analysis
rules which resist the definition of 'semantic head' as it is
required in van Noord's head-corner algorithm. [Reyle, 1993]
developed an inference system for Underspecified Discourse
Representation Structures (UDRS's), i.e. Discourse
Representation Structures [Kamp and Reyle, 1993] which are
underspecified with respect to scope. The following UDRS
simultaneously represents the two readings of the sentence
'every woman loves a man' by leaving the exact structural
embedding of the quantified phrases underspecified.




(5) 



An arrow pointing from X_2 to X_1 is called a subordination
constraint and means that the formula X_2 must not have wider
scope than X_1. [Frank and Reyle, 1992] proposed rules for
the construction of UDRS's in an HPSG-style syntax, cf.
[Pollard and Sag, 1993], which are shown in Fig. 4 and 5 in a
somewhat adapted manner. Semantic composition is performed by
the coindexing of the features dref, res, subj, etc., which
serve as an interface to the value of the sem feature, the
actual semantic representation. For the phrase-structure tree
rooted with s, there is no leaf which would fulfill the
definition of a semantic head given in [Shieber et al., 1990]
or [van Noord, 1994]. Hence, the head-corner generator of
Fig. 3 with a link relation based on semantic heads will not
be applicable.
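To make the role of subordination constraints more tangible, the following sketch enumerates the scope orderings that a set of constraints admits for 'every woman loves a man'. It is a strong simplification under stated assumptions: the labels (`every_woman`, `a_man`, `loves`) are hypothetical stand-ins for the partial DRSs of (5), scope is modelled as a total order from widest to narrowest, and a constraint `(lower, upper)` encodes an arrow from `lower` to `upper`.

```python
from itertools import permutations

# Hypothetical labels for the quantified phrases and the verb condition.
QUANTIFIERS = ["every_woman", "a_man"]

# (lower, upper): 'lower' must not have wider scope than 'upper'.
CONSTRAINTS = [("loves", "every_woman"), ("loves", "a_man")]

def readings(quantifiers, body, constraints):
    """Return all scope chains (widest scope first) that respect
    every subordination constraint."""
    result = []
    for order in permutations(quantifiers):
        chain = list(order) + [body]
        rank = {label: position for position, label in enumerate(chain)}
        if all(rank[lower] >= rank[upper] for lower, upper in constraints):
            result.append(chain)
    return result
```

With only the two constraints above, both quantifier orders survive, so a single underspecified structure covers both readings. Adding a further constraint between the two quantifiers would leave one reading.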



4 Syntactic-head-driven generation

4.1 A new link relation 

One could define a weak notion of a semantic head 
which requires that the semantic form of the semantic 
head is a (possibly empty) substructure of the root 
semantics. But this is rather meaningless, since now 
every leaf will qualify as a semantic head. As a way 
out, there is still the notion of a syntactic head, which 
can serve as the pivot of the generation process. 

Assume that the syntactic head leaf for each local 
syntax tree has been defined by the grammar writer. 
We get the following preliminary version of a syntax- 
based link relation: 



link((x, X), (x_i, X_i)) if                              (6)

1. either x = x_i

2. or x_j is a possible syntactic head of x
   and link((x_j, X_j), (x_i, X_i))

This is the kind of link relation which is used for pars- 
ing. In general, it works fine there, because with each 
lexical lookup a part of the input structure, i.e. of the 
input string, is consumed. In order to reduce the num- 
ber of non-terminating cases for generation, a similar 
precaution has to be added, i.e. the input structure 
has to be taken into account. The final version of a 
syntax-based link relation incorporates a test for the 
weak notion of a semantic head: 

link((x, X), (x_i, X_i)) if                              (7)

1. either x = x_i and
   X_i is a (possibly empty) substructure of X

2. or x_j is a possible syntactic head of x
   and link((x_j, X), (x_i, X_i))
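The link relation (7) can be sketched directly as a recursive predicate. The sketch below assumes ground terms as nested tuples with `()` for empty semantics and a hypothetical head table `SYNTACTIC_HEADS`. The `visited` set is an addition that is not part of (7): it merely keeps the Python recursion from looping on cyclic head tables such as vp being a possible head of vp.

```python
def subterms(term):
    """Yield the term itself and every component nested inside it."""
    yield term
    if isinstance(term, tuple):
        for part in term:
            yield from subterms(part)

def is_substructure(small, big):
    """Weak test: empty semantics is a substructure of everything."""
    return small == () or any(small == s for s in subterms(big))

# Assumed table: category -> categories that may be its syntactic head.
SYNTACTIC_HEADS = {"s": {"vp"}, "vp": {"v", "vp"}, "np": {"n"}}

def link(goal, candidate, heads=SYNTACTIC_HEADS, visited=frozenset()):
    (x, X), (x_i, X_i) = goal, candidate
    # Clause 1: same category and X_i is a substructure of X.
    if x == x_i and is_substructure(X_i, X):
        return True
    if x in visited:          # cycle guard (assumption, see lead-in)
        return False
    # Clause 2: descend along possible syntactic heads, keeping X.
    return any(link((x_j, X), candidate, heads, visited | {x})
               for x_j in heads.get(x, ()))
```

Note that the goal semantics X is passed down unchanged in clause 2, as in (7), so the substructure test is always made against the semantics of the original goal.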

Figure 5: A grammar with UDRS-construction rules - syntax rules

The substructure check only makes sense if the semantics X of
the current goal is instantiated. This might not be the case
when the proper semantic head and the syntactic head differ,
and a sister goal of the semantic head is to be expanded
before the head itself. Hence, in general, the sister goals
must be reordered according to the degree of instantiation of
their semantic representations. In addition to the improved
termination properties, the condition on the semantic
representation helps to filter out useless candidates from
the lexicon, i.e. lexical entries which will never become
part of the final derivation because their semantic
representations do not fit.
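One simple way to picture the reordering of sister goals is a static sort by how much of each goal's semantics is already known. The sketch below is only an approximation under stated assumptions: `Var` is a hypothetical marker for an unbound variable, the degree of instantiation is taken to be the share of non-variable symbols, and the sort is done once, whereas a wait-mechanism would delay goals dynamically.

```python
class Var:
    """Hypothetical marker for an uninstantiated semantic variable."""
    def __init__(self, name):
        self.name = name

def instantiation(term):
    """Return (known symbols, all symbols) for a nested tuple term."""
    if isinstance(term, Var):
        return (0, 1)
    if isinstance(term, tuple):
        known, total = 0, 0
        for part in term:
            k, t = instantiation(part)
            known += k
            total += t
        return (known, total)
    return (1, 1)

def degree(term):
    """Share of instantiated symbols; an empty term counts as known."""
    known, total = instantiation(term)
    return known / total if total else 1.0

def reorder(goals):
    """Sort (category, semantics) goals, most instantiated first."""
    return sorted(goals, key=lambda goal: degree(goal[1]), reverse=True)
```

For the sister goals np with a bare variable and vp with `("walk", Var("Y"))`, the vp is moved to the front, so the lexicon is consulted for the verb before the np is expanded.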

4.2 Grammars with head movement 

In order to simplify the presentation in the following, we
assume that each syntax tree in a grammar is isomorphic to
the corresponding semantic analysis tree. This means that
both trees can be merged into one tree by labeling the nodes
with syntax-semantics pairs:



           (x_0, X_0)
           /        \
   (x_1, X_1)    (x_2, X_2)                  (8)



In [Shieber et al., 1990] an ad hoc solution was proposed to
enforce termination when the semantic head has been moved.
With a syntactic-head-driven strategy, head movement does not
cause a problem if the landing site of the head is the
'syntactic head' (or rather: the main functor category of the
clause, in categorial grammar terminology) of a superordinate
clause. This is postulated by syntactic descriptions like



        (cp_f, X_0)                            cp_s
        /          \                          /    \
 (v_i, X_i)   (vp, X_0)/[(v_i, X_i)]   spec: xp_j   cp_f/[xp_j]      (9)

where vp/[v] means that the derivation of the vp-node has to
include an empty v-leaf. In this example, the syntactic head
(the c-position) of the cp_f will be visited before the vp is
to be derived, hence the exact information of the verb trace
will be available in time. Similarly for the movement to the
'vorfeld'. However, if verb-second configurations are
described by a single structure



    (cp_s, X_0)

    (cp_f, X_0)                              (10)



the algorithm runs into a deadlock: the vp-node cannot be
processed completely, because the semantics of the XP-trace
is unknown, and the expansion of the XP-filler position will
be delayed for the same reason. If this syntactic description
had to be preferred over the one in (9), the link relation
should be further modified. The substructure test wrt. the
semantics of the current goal should be replaced by a
substructure test wrt. the global input semantics, which
leads to a loss of flexibility, as has been discussed in
connection with the pure bottom-up approach.

4.3 Implementation 

Since the algorithm has been implemented in the CUF
language¹, which includes a wait-mechanism, the reordering of
subgoals can be delegated to CUF.

Instead of a full-blown substructure test which 
might be quite complicated on graphs like UDRS's, 
only the predicate names (and other essential 'seman- 
tic' keywords) of the lexical entry are mapped on the 
current goal semantics. If such a map is not feasible, 
this lexical entry is dropped. 
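The cheap replacement for the full substructure test can be sketched as a comparison of predicate names. This is not the CUF implementation: it assumes semantic forms as nested tuples whose first element is the predicate name, and `predicate_names` and `may_fit` are hypothetical helper names. Names are compared as multisets, which is one possible reading of 'mapped on the current goal semantics'.

```python
from collections import Counter

def predicate_names(sem):
    """Collect the predicate names (first tuple elements) of a term."""
    names = Counter()
    if isinstance(sem, tuple) and sem:
        names[sem[0]] += 1
        for arg in sem[1:]:
            names += predicate_names(arg)
    return names

def may_fit(lexical_sem, goal_sem):
    """True if every predicate name of the lexical entry can be mapped
    onto a distinct occurrence in the goal semantics."""
    missing = predicate_names(lexical_sem) - predicate_names(goal_sem)
    return not missing
```

A lexical entry for which `may_fit` returns `False` can be dropped immediately. An entry that passes may still fail later, since argument structure is ignored by this filter.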

We restrict the grammars to lexicalized ones. A grammar is
lexicalized if for every local syntax tree there is at least
one preterminal leaf, cf. [Schabes and Waters, 1993]. Note
that lexicalization does not affect the expressibility of the
grammar [Bar-Hillel et al., 1960], [Schabes and Waters,
1993]. However, the generation algorithm becomes much simpler
and hence more efficient. There is no need for a transitive
link relation, since a goal can immediately match the mother
node of a preterminal. The lexicon access and the head-corner
completion step can be merged into one rule schema².

1 The CUF system is an implementation of a theorem prover
for a Horn clause logic with typed feature terms [Dörre and
Dorna, 1993].

2 An instance of our head-corner generator (without an
integrated treatment of movement) is the UCG-generator by
Calder et al. [Calder et al., 1989] (modulo the use of unary
category transformation rules), which relies, in addition, on
the symmetry of syntactic and semantic head. A
syntactic-head-driven generator for a kind of lexicalized
grammars has been proposed independently by [Kay, 1993].
Another variant of a lexicalized grammar by [Dymetman et al.,
1990] does not make use of the head-corner idea but rather
corresponds to the top-down generation schema presented in
Fig. 1.

A version of the Non-Local-Feature principle of HPSG has been
integrated into the algorithm. Every non-head nonterminal
leaf of a local tree must come with a (possibly empty)
multiset of syntax-semantics pairs as the value of its
to_bind:slash-feature (feature abbreviated as /), cf. example
(9). From these static values, the dynamic
inherited:slash-values (feature abbreviated as //) can be
calculated during generation, see rule (lex) in Fig. 7.

(1a) Choose a lexical entry as the head x_h of the current
goal x_0. Then the substructure condition must hold for the
corresponding semantic forms X_h and X_0. The //-value T_h
must be empty.

(1b) Or choose an element of the //-value T_0 of the current
goal x_0. Then the //-value T_h becomes [(x_h, X_h)]. The
associated string w_h is empty.

(2) There must be a lexicalized tree which connects the goal
x_0 and the chosen head x_h. The //-value T_0 is split into
disjoint sets T_1, ..., T_n. The //-values of the new
subgoals x_1, ..., x_n are the disjoint set unions T_i ⊎ T'_i
where T'_i is the /-value of x_i in the local tree given in
the grammar.

Note that this version of the Non-Local-Feature principle
corresponds to the hypothetical reasoning mechanism which is
provided by the Lambek categorial grammars [Lambek, 1958],
[König, 1994]. This is illustrated by the fact that e.g. the
left tree in example (9) can be rendered in categorial
grammar notation as cp_f/(vp/v). Hence, the algorithm in
Fig. 7 has a clear logical basis.
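Steps (1a), (1b) and (2) above can be sketched as two small generators. This is an illustration under stated assumptions rather than the CUF encoding: slash values are plain Python lists of (category, semantics) pairs, `contains` is a simplified equality-based substructure test, and all function names are hypothetical. In case (1b) the caller is assumed to remove the chosen trace from the inherited list before distributing the rest over the subgoals.

```python
from itertools import product

def contains(small, big):
    """Simplified substructure test on flat tuple terms."""
    return small == big or (isinstance(big, tuple) and small in big)

def head_choices(goal_sem, inherited, lexicon):
    """Yield (head pair, head slash value, string) for steps (1a)/(1b)."""
    for cat, sem, word in lexicon:            # (1a) lexical head
        if contains(sem, goal_sem):
            yield (cat, sem), [], word
    for cat, sem in inherited:                # (1b) trace from T_0
        yield (cat, sem), [(cat, sem)], ""

def distribute(inherited, to_bind):
    """Step (2): split the inherited elements over the n subgoals in all
    possible ways and add each subgoal's static to_bind value T'_i."""
    n = len(to_bind)
    for assignment in product(range(n), repeat=len(inherited)):
        bins = [[] for _ in range(n)]
        for item, index in zip(inherited, assignment):
            bins[index].append(item)
        yield [bins[i] + list(to_bind[i]) for i in range(n)]
```

The enumeration in `distribute` is exhaustive and therefore exponential in the number of inherited elements. Duplicate elements in the multiset also produce duplicate splits, which a real implementation would want to avoid.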



5 Conclusion 

This paper gives a syntactic-head-driven generation algorithm
which includes a well-defined treatment of moved
constituents. Since it relies on the notion of a syntactic
head instead of a semantic head, it also works for grammars
where semantic heads are not available in general, such as a
grammar which includes semantic decomposition rules for
(scopally) Underspecified Discourse Representation
Structures. By using the same notion of head both for parsing
and for generation, the two techniques become even closer. In
effect, the abstract specifications of the generation
algorithms which we gave above could be read as parsing
algorithms, modulo a few changes (of the success condition
and the link relation).

Generation from Underspecified DRS's means that sentences can
be generated from meaning representations which have not been
disambiguated with regard to quantifier scope. This is of
particular importance for applications in machine
translation, where one wants to avoid the resolution of scope
relations as long as the underspecified meaning can be
rendered in the source and in the target language. Future
work should give more consideration to the strategic part of
the generation problem, e.g. try to find heuristics and
strategies which handle situations of 'scope mismatch' where
one language has to be more precise with regard to scope than
the other.

all leaves are labeled with terminals                          (success)

                     (x_0, X_0)//T_0
 ----------------------------------------------------------    (lex)
                     (x_0, X_0)//T_0
                   /        |        \
 (x_1, X_1)//T_1 ... (x_h, X_h)//T_h ... (x_n, X_n)//T_n
                            |
                           w_h

 1. if the lexical entry (x_h, X_h) over w_h ∈ G
       and X_h substructure of X_0 and T_h := []
    or if (x_h, X_h) ∈ T_0 and T_h := [(x_h, X_h)] and w_h := ε

 2. and the local tree (x_0, X_0) over
       (x_1, X_1) ... (x_h, X_h) ... (x_n, X_n) ∈ G
       and T_0 := T_1 ⊎ ... ⊎ T_n

Figure 7: Head-Corner Generator for lexicalized grammars (G grammar description; x_i syntactic category
symbol; X_i semantic representation; T_i slash-values)